An Efficient Algorithm for Topic Ranking and Modeling Topic Evolution

Authors

  • Kumar Shubhankar
  • Aditya Pratap Singh
  • Vikram Pudi
Abstract

In this paper we introduce a novel and efficient approach to detecting and ranking topics in a large corpus of research papers. With the rapidly growing size of the academic literature, topic detection and topic ranking have become challenging tasks. We present an approach that uses closed frequent keyword-sets to form topics. We devise a modified, time-independent PageRank algorithm that assigns an authoritative score to each topic by considering the sub-graph in which the topic appears, producing a ranked list of topics. The use of the citation network and the introduction of time invariance in the topic ranking algorithm yield interesting results. Our approach also provides a clustering technique for research papers that uses topics as the similarity measure. We extend our algorithms to study various aspects of topic evolution, which gives insight into trends in research areas over time. Our algorithms also detect hot topics and landmark topics over the years. We test our algorithms on the DBLP dataset and show that they are fast, effective and scalable.
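As a rough illustration of the "closed frequent keyword-set" idea mentioned in the abstract, the sketch below uses the standard data-mining definition: a keyword-set is frequent if at least `min_support` papers contain it, and closed if no proper superset occurs in the same number of papers. This is a brute-force sketch under that assumption, not the authors' algorithm; the function name, the `min_support` parameter, and the toy keywords are hypothetical.

```python
from itertools import combinations


def closed_frequent_keyword_sets(papers, min_support):
    """Return {keyword_set: support} for closed frequent keyword-sets.

    papers: list of keyword sets, one per paper (toy input, assumed).
    min_support: minimum number of papers that must contain a set.
    Brute force: enumerates every subset, so only suitable for tiny inputs.
    """
    # Count, for each keyword subset, how many papers contain it.
    support = {}
    for keywords in papers:
        items = sorted(keywords)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                support[key] = support.get(key, 0) + 1

    # Keep only the subsets that meet the support threshold.
    frequent = {s: c for s, c in support.items() if c >= min_support}

    # A frequent set is closed if no proper superset has the same support.
    closed = {}
    for s, c in frequent.items():
        if not any(s < t and c == frequent[t] for t in frequent):
            closed[s] = c
    return closed


# Hypothetical toy corpus: each paper is reduced to a set of keywords.
papers = [
    {"graph", "ranking", "topic"},
    {"graph", "ranking"},
    {"topic", "ranking"},
    {"graph", "ranking", "topic"},
]
closed = closed_frequent_keyword_sets(papers, min_support=2)
for keyword_set, count in sorted(closed.items(), key=lambda kv: -kv[1]):
    print(sorted(keyword_set), count)
```

In this toy corpus, {"graph"} is frequent but not closed, because {"graph", "ranking"} appears in exactly the same papers; only the larger set would be kept as a candidate topic.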

Similar resources

A Topic Modeling Approach to Rank Aggregation

We propose a new model for rank aggregation from pairwise comparisons that captures both ranking heterogeneity across users and ranking inconsistency for each user. We establish a formal statistical equivalence between the new model and topic models. We leverage recent advances in the topic modeling literature to develop an algorithm that can learn shared latent rankings with provable statistic...

Ranking Authors with Learning-to-rank Topic Modeling

Topic modeling has emerged as a popular learning technique not only in mining text representations, but also in modeling authors’ interests and influence, as well as predicting linkage among documents or authors. However, few existing topic models distinguish and make use of the prior knowledge in regard to the different importance of documents (authors) over topics. In this paper, we focus on ...

Prioritizing the Criteria for Choosing a Thesis Topic Using the Analytic Hierarchy Process (AHP) from the Viewpoint of Ph.D. Students

Background and Aim: Choosing thesis topic is one of the most important decisions of postgraduate students and many factors affect such decision. This study aimed to prioritize the criteria for choosing thesis topic from Ph.D. students’ viewpoint, using the analytic hierarchy process (AHP) and ranking methods. Materials and Methods: This analytical study was carried out on the School of Public ...

A review of text mining approaches and their function in discovering and extracting a topic

Background and aim: Four text mining methods are examined, with a focus on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...

Criteria for Cluster-Based Personalized Search

We study personalized web ranking algorithms based on the existence of document clusterings. Motivated by the topic sensitive page ranking of Haveliwala [20], we develop and implement an efficient “local-cluster” algorithm by extending the web search algorithm of Achlioptas, Fiat, Karlin and McSherry [10]. We propose some formal criteria for evaluating such personalized ranking algorithms and p...


Publication date: 2011